The Generalization Paradox of Ensembles
Authors
Abstract
Ensemble models—built by methods such as bagging, boosting, and Bayesian model averaging—appear dauntingly complex, yet tend to strongly outperform their component models on new data. Doesn’t this violate “Occam’s razor”—the widespread belief that “the simpler of competing alternatives is preferred”? We argue no: if complexity is measured by function rather than form—for example, according to generalized degrees of freedom (GDF)—the razor’s role is restored. On a two-dimensional decision tree problem, bagging several trees is shown to actually have less GDF complexity than a single component tree, removing the generalization paradox of ensembles.
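The abstract's central move is to measure complexity by generalized degrees of freedom (GDF) rather than by a model's form. GDF is typically estimated by perturbing the training targets with small noise, refitting, and summing how sensitively each fitted value responds to its own perturbation. The sketch below illustrates that perturbation scheme on a two-dimensional problem, comparing a single decision tree against a bagged ensemble of trees; the models, noise scale, and data are illustrative assumptions, not the paper's exact experimental setup.

```python
# Hedged sketch of GDF estimation by target perturbation (Ye-style):
# GDF = sum_i d(yhat_i)/d(y_i), estimated by regressing the change in each
# fitted value against the noise added to its own target.
# Assumptions: scikit-learn models stand in for the paper's trees; tau and
# the dataset are arbitrary choices for illustration.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

def estimate_gdf(make_model, X, y, n_perturb=30, tau=0.1, seed=0):
    rng = np.random.default_rng(seed)
    base = make_model().fit(X, y).predict(X)   # fitted values on clean targets
    n = len(y)
    deltas, fits = [], []
    for _ in range(n_perturb):
        d = rng.normal(0.0, tau, size=n)       # small Gaussian perturbation
        f = make_model().fit(X, y + d).predict(X)
        deltas.append(d)
        fits.append(f - base)                  # change in fitted values
    D, F = np.array(deltas), np.array(fits)
    # Per-point sensitivity: regression-through-origin slope of the change
    # in fitted value i against the perturbation of target i.
    sens = np.einsum("ti,ti->i", D, F) / np.einsum("ti,ti->i", D, D)
    return sens.sum()

rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))                 # two-dimensional inputs
y = (X[:, 0] > 0.5).astype(float) + rng.normal(0, 0.2, 200)

tree_gdf = estimate_gdf(lambda: DecisionTreeRegressor(max_leaf_nodes=8), X, y)
bag_gdf = estimate_gdf(
    lambda: BaggingRegressor(DecisionTreeRegressor(max_leaf_nodes=8),
                             n_estimators=25, random_state=0), X, y)
print(f"single tree GDF ~ {tree_gdf:.1f}, bagged ensemble GDF ~ {bag_gdf:.1f}")
```

On runs like this, the bagged ensemble's GDF is typically no larger than the single tree's, which is the sense in which averaging many trees can be *functionally* simpler than one tree despite its more elaborate form.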
Similar Articles
Exact remote state preparation for multiparties (2003)
We discuss the exact remote state preparation protocol for special ensembles of qubits at multiple locations. We also present a generalization of this protocol to higher-dimensional Hilbert-space systems for multiparties. Using the 'dark states', the multiparty, higher-dimensional analogue of the singlet EPR pair, as the quantum channel, we show several instances of the remote state preparation protocol...
Coherent Transport of Single Photon in a Quantum Super-cavity with Mirrors Composed of Λ-Type Three-level Atomic Ensembles
In this paper, we study the coherent transport of a single photon in a coupled resonator waveguide (CRW) where two three-level Λ-type atomic ensembles are embedded in two separate cavities. We show that it is possible to control the photon transmission and reflection coefficients by using classical control fields. In particular, we find that the total photon transmission and reflection are achieva...
Ensemble strategies to build neural network to facilitate decision making
There are three major strategies for forming neural network ensembles. The simplest is the cross-validation strategy, in which all members are trained with the same training data. Bagging and boosting strategies produce perturbed samples from the training data. This paper provides an ideal model based on two important factors: activation function and number of neurons in the hidden layer and based u...
Using Diversity in Preparing Ensembles of Classifiers Based on Different Feature Subsets to Minimize Generalization Error
It is well known that ensembles of predictors produce better accuracy than a single predictor provided there is diversity in the ensemble. This diversity manifests itself as disagreement or ambiguity among the ensemble members. In this paper we focus on ensembles of classifiers based on different feature subsets and we present a process for producing such ensembles that emphasizes diversity (am...
Bounds on the Generalization Performance of Kernel Machines Ensembles
We study the problem of learning using combinations of machines. In particular, we present new theoretical bounds on the generalization performance of voting ensembles of kernel machines. Special cases considered are bagging and support vector machines. We present experimental results supporting the theoretical bounds, and describe characteristics of kernel machine ensembles suggested by the ...